UPMASK: unsupervised photometric membership assignment in stellar clusters
We develop a method for membership assignment in stellar clusters using only
photometry and positions. The method, UPMASK, is aimed to be unsupervised, data
driven, model free, and to rely on as few assumptions as possible. It is based
on an iterative process, principal component analysis, a clustering algorithm,
and kernel density estimations. Moreover, it is able to take into account
arbitrary error models. An implementation in R was tested on simulated clusters
that covered a broad range of ages, masses, distances, reddenings, and also on
real data of cluster fields. Running UPMASK on simulations showed that it
effectively separates cluster and field populations. The overall spatial
structure and distribution of cluster member stars in the colour-magnitude
diagram were recovered under a broad variety of conditions. For a set of 360
simulations, the resulting true positive rates (a measurement of purity) and
member recovery rates (a measurement of completeness) at the 90% membership
probability level reached high values for a range of open cluster ages
( yr), initial masses (M_{\sun}) and
heliocentric distances ( kpc). UPMASK was also tested on real data
from the fields of the open cluster Haffner~16 and of the closely projected
clusters Haffner~10 and Czernik~29. These tests showed that even for moderate
variable extinction and cluster superposition, the method yielded useful
cluster membership probabilities and provided some insight into their stellar
contents. The UPMASK implementation will be available at the CRAN archive.
Comment: 12 pages, 13 figures, accepted for publication in Astronomy and
Astrophysics
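The two quality measures quoted in this abstract can be illustrated with a minimal sketch. It assumes that purity is the fraction of selected stars that are true members and that completeness is the fraction of true members that are selected, both at the 90% membership probability level. The function name and the toy data are hypothetical and are not taken from the UPMASK implementation.

```python
def purity_and_completeness(probabilities, is_true_member, threshold=0.9):
    """Return (purity, completeness) for stars selected at p >= threshold."""
    selected = [p >= threshold for p in probabilities]
    # Stars that are selected and are also genuine cluster members.
    true_pos = sum(1 for s, t in zip(selected, is_true_member) if s and t)
    n_selected = sum(selected)
    n_members = sum(1 for t in is_true_member if t)
    purity = true_pos / n_selected if n_selected else float("nan")
    completeness = true_pos / n_members if n_members else float("nan")
    return purity, completeness


# Toy data, invented for illustration: six stars from one simulated field.
probs = [0.97, 0.95, 0.92, 0.60, 0.91, 0.10]
truth = [True, True, True, True, False, False]
purity, completeness = purity_and_completeness(probs, truth)
print(purity, completeness)
```

In a simulation the true membership is known, so both numbers can be computed directly; on real cluster fields only the probabilities are available.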
The first analytical expression to estimate photometric redshifts suggested by a machine
We report the first analytical expression purely constructed by a machine to
determine photometric redshifts ($z_{\rm phot}$) of galaxies. A simple and
reliable functional form is derived using galaxies from the Sloan
Digital Sky Survey Data Release 10 (SDSS-DR10) spectroscopic sample. The method
automatically dropped the and bands, relying only on , and
for the final solution. Applying this expression to other SDSS-DR10
galaxies, with measured spectroscopic redshifts ($z_{\rm spec}$), we achieved a
mean and a scatter when averaged up to . The method was
also applied to the PHAT0 dataset, confirming the competitiveness of our
results when faced with other methods from the literature. This is the first
use of symbolic regression in cosmology, representing a leap forward in the
astronomy-data-mining connection.
Comment: 6 pages, 4 figures. Accepted for publication in MNRAS Letters
Detecting stars, galaxies, and asteroids with Gaia
(Abridged) Gaia aims to make a 3-dimensional map of 1,000 million stars in
our Milky Way to unravel its kinematical, dynamical, and chemical structure and
evolution. Gaia's on-board detection software discriminates stars from spurious
objects like cosmic rays and Solar protons. For this, parametrised
point-spread-function-shape criteria are used. This study aims to provide an
optimum set of parameters for these filters. We developed an emulation of the
on-board detection software, which has 20 free, so-called rejection parameters
which govern the boundaries between stars on the one hand and sharp or extended
events on the other hand. We evaluate the detection and rejection performance
of the algorithm using catalogues of simulated single stars, double stars,
cosmic rays, Solar protons, unresolved galaxies, and asteroids. We optimised
the rejection parameters, improving - with respect to the functional baseline -
the detection performance of single and double stars, while, at the same time,
improving the rejection performance of cosmic rays and of Solar protons. We
find that the minimum separation to resolve a close, equal-brightness double
star is 0.23 arcsec in the along-scan and 0.70 arcsec in the across-scan
direction, independent of the brightness of the primary. We find that, whereas
the optimised rejection parameters have no significant impact on the
detectability of de Vaucouleurs profiles, they do significantly improve the
detection of exponential-disk profiles. We also find that the optimised
rejection parameters provide detection gains for asteroids fainter than 20 mag
and for fast-moving near-Earth objects fainter than 18 mag, albeit this gain
comes at the expense of a modest detection-probability loss for bright,
fast-moving near-Earth objects. The major side effect of the optimised
parameters is that spurious ghosts in the wings of bright stars essentially
pass unfiltered.
Comment: Accepted for publication in A&A
Using gamma regression for photometric redshifts of survey galaxies
Machine learning techniques offer a plethora of opportunities in tackling big
data within the astronomical community. We present the set of Generalized
Linear Models as a fast alternative for determining photometric redshifts of
galaxies, a set of tools not commonly applied within astronomy, despite being
widely used in other fields. With this technique, we achieve catastrophic
outlier rates of the order of ~1% in a matter of seconds
on large datasets of ~1,000,000 objects. To make these techniques easily
accessible to the astronomical community, we developed a set of libraries and
tools that are publicly available.
Comment: Refereed Proceeding of "The Universe of Digital Sky Surveys"
conference held at the INAF - Observatory of Capodimonte, Naples, on
25th-28th November 2014, to be published in the Astrophysics and Space
Science Proceedings, edited by Longo, Napolitano, Marconi, Paolillo, Iodice;
6 pages and 1 figure
The overlooked potential of Generalized Linear Models in astronomy-II: Gamma regression and photometric redshifts
Machine learning techniques offer a precious tool box for use within astronomy to solve problems involving so-called big data. They provide a means to make accurate predictions about a particular system without prior knowledge of the underlying physical processes of the data. In this article, and the companion papers of this series, we present the set of Generalized Linear Models (GLMs) as a fast alternative method for tackling general astronomical problems, including the ones related to the machine learning paradigm. To demonstrate the applicability of GLMs to inherently positive and continuous physical observables, we explore their use in estimating the photometric redshifts of galaxies from their multi-wavelength photometry. Using the gamma family with a log link function we predict redshifts from the PHoto-z Accuracy Testing simulated catalogue and a subset of the Sloan Digital Sky Survey from Data Release 10. We obtain fits that result in catastrophic outlier rates as low as ~1% for simulated and ~2% for real data. Moreover, we can easily obtain such levels of precision within a matter of seconds on a normal desktop computer and with training sets that contain merely thousands of galaxies. Our software is made publicly available as a user-friendly package developed in Python, R and via an interactive web application. This software allows users to apply a set of GLMs to their own photometric catalogues and generates publication quality plots with minimum effort. By facilitating their ease of use to the astronomical community, this paper series aims to make GLMs widely known and to encourage their implementation in future large-scale projects, such as the Large Synoptic Survey Telescope.
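As a rough illustration of gamma regression with a log link, the sketch below fits a one-predictor model by iteratively reweighted least squares using only the Python standard library. It is not the authors' package: the single `colour` predictor, the toy data, the function names, and the outlier criterion |z_phot - z_spec| / (1 + z_spec) > 0.15 are all assumptions made for this example (the abstract does not state which outlier definition was used).

```python
import math


def ols_line(x, z):
    """Ordinary least squares for z = b0 + b1 * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    b1 = sxz / sxx
    return mean_z - b1 * mean_x, b1


def fit_gamma_log_link(x, y, n_iter=25):
    """Gamma GLM with log link: E[y] = exp(b0 + b1 * x), fitted by IRLS.

    For this family/link pair the IRLS weights are constant, so each
    iteration is a plain least-squares fit to the working response.
    """
    b0, b1 = ols_line(x, [math.log(yi) for yi in y])  # starting values
    for _ in range(n_iter):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        working = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        b0, b1 = ols_line(x, working)
    return b0, b1


def outlier_fraction(z_phot, z_spec, cut=0.15):
    """Fraction of objects with |z_phot - z_spec| / (1 + z_spec) > cut (assumed criterion)."""
    bad = sum(1 for zp, zs in zip(z_phot, z_spec) if abs(zp - zs) / (1 + zs) > cut)
    return bad / len(z_spec)


# Toy data, invented for illustration: redshift grows exponentially with one
# colour, with an alternating +/-5% multiplicative perturbation.
colour = [0.1 * i for i in range(1, 21)]
z_spec = [math.exp(-2.0 + 1.5 * c) * (1 + 0.05 * (-1) ** i)
          for i, c in enumerate(colour, start=1)]
b0, b1 = fit_gamma_log_link(colour, z_spec)
z_phot = [math.exp(b0 + b1 * c) for c in colour]
print(b0, b1, outlier_fraction(z_phot, z_spec))
```

The log link keeps every predicted redshift positive, which is why the gamma family suits inherently positive, continuous observables. A realistic fit would use several photometric bands as predictors rather than a single colour.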
A probabilistic approach to emission-line galaxy classification
We invoke a Gaussian mixture model (GMM) to jointly analyse two traditional
emission-line classification schemes of galaxy ionization sources: the
Baldwin-Phillips-Terlevich (BPT) and $W_{H\alpha}$ vs. [NII]/H$\alpha$
(WHAN) diagrams, using spectroscopic data from the Sloan Digital Sky Survey
Data Release 7 and SEAGal/STARLIGHT datasets. We apply a GMM to empirically
define classes of galaxies in a three-dimensional space spanned by the
[OIII]/H$\beta$, [NII]/H$\alpha$, and EW(H$\alpha$) optical
parameters. The best-fit GMM based on several statistical criteria suggests a
solution around four Gaussian components (GCs), which are capable of explaining up
to 97 per cent of the data variance. Using elements of information theory, we
compare each GC to their respective astronomical counterpart. GC1 and GC4 are
associated with star-forming galaxies, suggesting the need to define a new
starburst subgroup. GC2 is associated with BPT's Active Galactic Nuclei (AGN)
class and WHAN's weak AGN class. GC3 is associated with BPT's composite class
and WHAN's strong AGN class. Conversely, there is no statistical evidence --
based on four GCs -- for the existence of a Seyfert/LINER dichotomy in our
sample. Notwithstanding, the inclusion of an additional GC5 unravels it. The
GC5 appears associated with the LINER and Passive galaxies on the BPT and WHAN
diagrams respectively. Subtleties aside, we demonstrate the potential of our
methodology to recover/unravel different objects inside the wilderness of
astronomical datasets, without lacking the ability to convey physically
interpretable results. The probabilistic classifications from the GMM analysis
are publicly available within the COINtoolbox
(https://cointoolbox.github.io/GMM_Catalogue/).
Comment: Accepted for publication in MNRAS
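The idea of probabilistic classification with a Gaussian mixture model can be sketched in a few lines. The example below is a deliberately simplified, one-dimensional, two-component fit by expectation-maximisation on synthetic data, whereas the abstract describes a three-dimensional fit with four or five components. The function names, starting values, and data are hypothetical and do not reproduce the published analysis.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, means, n_iter=100):
    """EM for a 1-D Gaussian mixture; `means` gives the starting centres."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: probability that each component generated each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means and variances from those probabilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # guard against a collapsing component
    return weights, means, variances, resp


# Synthetic stand-in for one diagnostic axis: two well-separated groups.
rng = random.Random(42)
data = ([rng.gauss(-1.0, 0.3) for _ in range(200)]
        + [rng.gauss(1.0, 0.3) for _ in range(200)])
weights, means, variances, resp = fit_gmm_1d(data, means=[-0.5, 0.5])
print(weights, means)
```

Each row of `resp` holds the membership probabilities of one object (from the final E-step), which is the kind of soft classification a GMM provides in place of hard dividing lines on a diagnostic diagram.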
Periodic Astrometric Signal Recovery through Convolutional Autoencoders
Astrometric detection involves a precise measurement of stellar positions,
and is widely regarded as the leading concept presently ready to find
earth-mass planets in temperate orbits around nearby sun-like stars. The
TOLIMAN space telescope[39] is a low-cost, agile mission concept dedicated to
narrow-angle astrometric monitoring of bright binary stars. In particular the
mission will be optimised to search for habitable-zone planets around Alpha
Centauri AB. If the separation between these two stars can be monitored with
sufficient precision, tiny perturbations due to the gravitational tug from an
unseen planet can be witnessed and, given the configuration of the optical
system, the scale of the shifts in the image plane is about one millionth of a
pixel. Image registration at this level of precision has never been
demonstrated (to our knowledge) in any setting within science. In this paper we
demonstrate that a Deep Convolutional Auto-Encoder is able to retrieve such a
signal from simplified simulations of the TOLIMAN data and we present the full
experimental pipeline to recreate our experiments from the simulations to the
signal analysis. In future works, all the more realistic sources of noise and
systematic effects present in the real-world system will be injected into the
simulations.
Comment: Preprint version of the manuscript to appear in the Volume
"Intelligent Astrophysics" of the series "Emergence, Complexity and
Computation", Book eds. I. Zelinka, D. Baron, M. Brescia, Springer Nature
Switzerland, ISSN: 2194-728